Information Extraction from Broadcast News Speech Data
Authors
Abstract
In this paper we describe a robust algorithm for information extraction from spoken language data. Our probabilistic algorithm builds on results in language modeling, using class-based smoothing to produce state-of-the-art performance for a wide range of speech error rates. We show that our system performs well with sparse data, as well as with out-of-domain data.
Similar resources
Keyphrase Cloud Generation of Broadcast News
This paper describes an enhanced automatic keyphrase extraction method applied to Broadcast News. The keyphrase extraction process is used to create a concept level for each news. On top of words resulting from a speech recognition system output and news indexation and it contributes to the generation of a tag/keyphrase cloud of the top news included in a Multimedia Monitoring Solution system f...
Information Extraction from Broadcast News
This paper discusses the development of trainable statistical models for extracting content from television and radio news broadcasts. In particular we concentrate on statistical finite state models for identifying proper names and other named entities in broadcast speech. Two models are presented: the first represents name class information as a word attribute; the second represents both word-...
Topic extraction with multiple topic-words in broadcast-news speech
This paper reports on topic extraction in Japanese broadcast-news speech. We studied, using continuous speech recognition, the extraction of several topic-words from broadcast-news. A combination of multiple topic-words represents the content of the news. This is a more detailed and more flexible approach than using a single word or a single category. A topic-extraction model shows the degree of...
Look Who is Talking: Soundbite Speaker Name Recognition in Broadcast News Speech
Speaker name recognition plays an important role in many spoken language applications, such as rich transcription, information extraction, question answering, and opinion mining. In this paper, we developed an SVM-based classification framework to determine the speaker names for those included speech segments in broadcast news speech, called soundbites. We evaluated a variety of features with d...
1998 Hub-4 Information Extraction Evaluation
This paper documents the Information Extraction Named-Entity Evaluation (IE-NE), one of the new spokes added to the DARPA-sponsored 1998 Hub-4 Broadcast News Evaluation. This paper discusses the information extraction task as posed for the 1998 Broadcast News Evaluation. This paper reviews the evaluation metrics, the scoring process, and the test corpus that was used for the evaluation. Finally...
Publication date: 1999